WiFi Signal


A Transformer-based Multimodal Fusion Model for Efficient Crowd Counting Using Visual and Wireless Signals

Cui, Zhe, Li, Yuli, Tran, Le-Nam

arXiv.org Artificial Intelligence

Current crowd-counting models often rely on single-modal inputs, such as visual images or wireless signal data, which can result in significant information loss and suboptimal recognition performance. To address these shortcomings, we propose TransFusion, a novel multimodal fusion-based crowd-counting model that integrates Channel State Information (CSI) with image data. By leveraging the powerful capabilities of Transformer networks, TransFusion effectively combines these two distinct data modalities, enabling the capture of comprehensive global contextual information that is critical for accurate crowd estimation. However, while Transformers excel at capturing global features, they can fail to identify the finer-grained local details essential for precise crowd counting. To mitigate this, we incorporate Convolutional Neural Networks (CNNs) into the model architecture, enhancing its ability to extract detailed local features that complement the global context provided by the Transformer. Extensive experimental evaluations demonstrate that TransFusion achieves high accuracy with minimal counting errors while maintaining superior efficiency.
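The core fusion idea — concatenating tokens from both modalities and letting self-attention mix them globally — can be illustrated with a minimal NumPy sketch. This is not the paper's architecture: the token shapes and random projections standing in for learned Q/K/V weights are assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, d_k):
    # Random projections stand in for learned Q/K/V weight matrices.
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.standard_normal((tokens.shape[-1], d_k)) for _ in range(3))
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = softmax(Q @ K.T / np.sqrt(d_k))  # every token attends to every other
    return scores @ V

# Toy token sequences for the two modalities (shapes are illustrative).
csi_tokens = np.random.default_rng(1).standard_normal((16, 32))  # 16 CSI patches
img_tokens = np.random.default_rng(2).standard_normal((49, 32))  # 7x7 image patches

# Concatenating before attention lets CSI tokens attend to image tokens
# and vice versa, which is what gives the fusion its global context.
fused = self_attention(np.vstack([csi_tokens, img_tokens]), d_k=32)
print(fused.shape)  # (65, 32)
```

A CNN branch, as in the paper, would then operate on local neighbourhoods of these tokens to recover the fine-grained detail that plain attention tends to smooth over.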


Lessons from Deploying Learning-based CSI Localization on a Large-Scale ISAC Platform

Zhang, Tianyu, Zhang, Dongheng, Geng, Ruixu, Xie, Xuecheng, Yang, Shuai, Chen, Yan

arXiv.org Artificial Intelligence

In recent years, Channel State Information (CSI), recognized for its fine-grained spatial characteristics, has attracted increasing attention in WiFi-based indoor localization. However, despite its potential, CSI-based approaches have yet to achieve the same level of deployment scale and commercialization as those based on Received Signal Strength Indicator (RSSI). A key limitation lies in the fact that most existing CSI-based systems are developed and evaluated in controlled, small-scale environments, limiting their generalizability. To bridge this gap, we explore the deployment of a large-scale CSI-based localization system involving over 400 Access Points (APs) in a real-world building under the Integrated Sensing and Communication (ISAC) paradigm. We highlight two critical yet often overlooked factors: the underutilization of unlabeled data and the inherent heterogeneity of CSI measurements. To address these challenges, we propose a novel CSI-based learning framework for WiFi localization, tailored for large-scale ISAC deployments on the server side. Specifically, we employ a novel graph-based structure to model heterogeneous CSI data and reduce redundancy. We further design a pretext pretraining task that incorporates spatial and temporal priors to effectively leverage large-scale unlabeled CSI data. Complementarily, we introduce a confidence-aware fine-tuning strategy to enhance the robustness of localization results. In a leave-one-smartphone-out experiment spanning five floors and 25,600 m², we achieve a median localization error of 2.17 meters and a floor accuracy of 99.49%. This performance corresponds to an 18.7% reduction in mean absolute error (MAE) compared to the best-performing baseline.
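The two headline metrics — median 2-D localization error and floor accuracy — are straightforward to compute; a minimal sketch follows. The function name and array layouts are assumptions for illustration, not from the paper's code.

```python
import numpy as np

def localization_metrics(pred_xy, true_xy, pred_floor, true_floor):
    """Median 2-D error (metres) and floor-classification accuracy."""
    errors = np.linalg.norm(pred_xy - true_xy, axis=1)  # per-sample Euclidean error
    floor_acc = float(np.mean(pred_floor == true_floor))
    return float(np.median(errors)), floor_acc

# Toy example: one prediction 2 m off, one exact, both floors correct.
pred = np.array([[0.0, 0.0], [3.0, 4.0]])
true = np.array([[0.0, 2.0], [3.0, 4.0]])
med, acc = localization_metrics(pred, true, np.array([1, 2]), np.array([1, 2]))
print(med, acc)  # 1.0 1.0
```

The paper reports the median rather than the mean error because indoor localization error distributions are typically heavy-tailed, and a few large misses would dominate a mean.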


Time-Frequency Analysis of Variable-Length WiFi CSI Signals for Person Re-Identification

Mao, Chen, Tan, Chong, Hu, Jingqi, Zheng, Min

arXiv.org Artificial Intelligence

Person re-identification (ReID), as a crucial technology in the field of security, plays an important role in security detection and people counting. Current security and monitoring systems largely rely on visual information, which may infringe on personal privacy and be susceptible to interference from pedestrian appearances and clothing in certain scenarios. Meanwhile, the widespread use of routers offers new possibilities for ReID. This letter introduces a method using WiFi Channel State Information (CSI), leveraging the multipath propagation characteristics of WiFi signals as a basis for distinguishing different pedestrian features. We propose a two-stream network structure capable of processing variable-length data, which analyzes the amplitude in the time domain and the phase in the frequency domain of WiFi signals, fuses time-frequency information through continuous lateral connections, and employs advanced objective functions for representation and metric learning. Tested on a dataset collected in the real world, our method achieves 93.68% mAP and 98.13% Rank-1.
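The two input streams the abstract describes — amplitude in the time domain and phase in the frequency domain — can be derived from complex CSI as sketched below. The CSI dimensions are illustrative assumptions, not the paper's dataset format.

```python
import numpy as np

def two_stream_features(csi):
    """Split complex CSI into the two streams described above:
    amplitude in the time domain, phase in the frequency domain."""
    amplitude = np.abs(csi)                    # time-domain amplitude stream
    phase = np.angle(np.fft.fft(csi, axis=0))  # frequency-domain phase stream
    return amplitude, phase

# Toy CSI: 128 time samples x 30 subcarriers of complex channel estimates.
rng = np.random.default_rng(0)
csi = rng.standard_normal((128, 30)) + 1j * rng.standard_normal((128, 30))
amp, ph = two_stream_features(csi)
print(amp.shape, ph.shape)  # (128, 30) (128, 30)
```

Each stream then feeds one branch of the two-stream network, with the lateral connections fusing them at multiple depths.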


Vision Reimagined: AI-Powered Breakthroughs in WiFi Indoor Imaging

Shi, Jianyang, Zhang, Bowen, Dubey, Amartansh, Murch, Ross, Jing, Liwen

arXiv.org Artificial Intelligence

Indoor imaging is a critical task for robotics and the Internet of Things. WiFi, as an omnipresent signal, is a promising candidate for carrying out passive imaging and synchronizing the up-to-date information to all connected devices. This is the first research work to consider WiFi indoor imaging as a multi-modal image generation task that converts the measured WiFi power into a high-resolution indoor image. Our proposed WiFi-GEN network achieves a shape reconstruction accuracy that is 275% of that achieved by physical model-based inversion methods. Additionally, the Frechet Inception Distance score has been significantly reduced by 82%. To examine the effectiveness of models for this task, the first large-scale dataset is released containing 80,000 pairs of WiFi signal and imaging target. Our model absorbs the challenges faced by model-based methods, including non-linearity, ill-posedness, and uncertainty, into the massive parameters of our generative AI network. The network is also designed to best fit measured WiFi signals and the desired imaging output. For reproducibility, we will release the data and code upon acceptance.


RoboFiSense: Attention-Based Robotic Arm Activity Recognition with WiFi Sensing

Zandi, Rojin, Behzad, Kian, Motamedi, Elaheh, Salehinejad, Hojjat, Siami, Milad

arXiv.org Artificial Intelligence

Despite the current surge of interest in autonomous robotic systems, robot activity recognition within restricted indoor environments remains a formidable challenge. Conventional methods for detecting and recognizing robotic arms' activities often rely on vision-based or light detection and ranging (LiDAR) sensors, which require line-of-sight (LoS) access and may raise privacy concerns, for example, in nursing facilities. This research pioneers an innovative approach harnessing channel state information (CSI) measured from WiFi signals, subtly influenced by the activity of robotic arms. We developed an attention-based network to classify eight distinct activities performed by a Franka Emika robotic arm in different situations. Our proposed bidirectional vision transformer-concatenated (BiVTC) methodology aspires to predict robotic arm activities accurately, even when trained on activities with different velocities, all without dependency on external or internal sensors or visual aids. The high dependency of CSI data on the environment motivated us to study the problem of sniffer location selection by systematically changing the sniffer's location and collecting different sets of data. Finally, this paper also marks the first publication of the CSI data of eight distinct robotic arm activities, collectively referred to as RoboFiSense. This initiative aims to provide a benchmark dataset and baselines to the research community, fostering advancements in the field of robotics sensing.


Robot Motion Prediction by Channel State Information

Zandi, Rojin, Salehinejad, Hojjat, Behzad, Kian, Motamedi, Elaheh, Siami, Milad

arXiv.org Artificial Intelligence

Autonomous robotic systems have gained a lot of attention in recent years. However, accurate prediction of robot motion in indoor environments with limited visibility is challenging. While vision-based and light detection and ranging (LiDAR) sensors are commonly used for motion detection and localization of robotic arms, they are privacy-invasive and depend on a clear line-of-sight (LOS) for precise measurements. In cases where additional sensors are not available or LOS is not possible, these technologies may not be the best option. This paper proposes a novel method that employs channel state information (CSI) from WiFi signals affected by robotic arm motion. We developed a convolutional neural network (CNN) model to classify four different activities of a Franka Emika robotic arm. The implemented method seeks to accurately predict robot motion even in scenarios in which the robot is obscured by obstacles, without relying on any attached or internal sensors.
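CNN classifiers for motion-induced CSI typically consume a 2-D time-frequency representation of the signal. The sketch below shows one common preprocessing step, a short-time FFT of a CSI amplitude series; the window and hop sizes are assumptions for illustration, not this paper's exact pipeline.

```python
import numpy as np

def csi_spectrogram(amplitude, win=32, hop=16):
    """Short-time FFT of a CSI amplitude series -> 2-D input for a CNN."""
    frames = [amplitude[i:i + win] * np.hanning(win)      # windowed segments
              for i in range(0, len(amplitude) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).T  # (freq, time)

# Toy amplitude trace with a 5 Hz component, standing in for arm motion.
sig = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 256))
spec = csi_spectrogram(sig)
print(spec.shape)  # (17, 15): 17 frequency bins x 15 time frames
```

Motion at different speeds shifts energy across the frequency axis of this image, which is the pattern a 2-D CNN can learn to separate into activity classes.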


Machine Learning Improves Mesh Networks & Fights Dead Zones

#artificialintelligence

We have talked about the impact that machine learning has had on website and app development. However, machine learning technology can also help solve Internet problems on a more granular level. A growing number of people have complained about WiFi dead zones. Fortunately, machine learning technology shows some promise in addressing them. One of the benefits of machine learning is that it can help improve mesh networks, which can minimize the risk of Internet connectivity problems.


CMU's DensePose From WiFi: An Affordable, Accessible and Secure Approach to Human Sensing

#artificialintelligence

The recent and rapid development of powerful machine learning models for computer vision has boosted 2D and 3D human pose estimation performance from RGB cameras, LiDAR, and radar inputs. These approaches, however, can require expensive and power-hungry hardware and have raised privacy concerns regarding their deployment in non-public areas. A Carnegie Mellon University research team addresses these issues in the new paper DensePose From WiFi, proposing WiFi-based DensePose, a neural network architecture that uses only WiFi signals for human dense pose estimation in scenarios with occlusion and multiple people. The researchers believe their work could have practical applications in monitoring the well-being of elderly people or identifying suspicious behaviours in the home. DensePose was introduced in 2018 and aims to map human pixels in an RGB image to the 3D surface of the human body.


Full body tracking with WiFi signals by utilizing deep learning architectures: AR_MR_XR

#artificialintelligence

Advances in computer vision and machine learning techniques have led to significant development in 2D and 3D human pose estimation from RGB cameras, LiDAR, and radars. However, human pose estimation from images is adversely affected by occlusion and lighting, which are common in many scenarios of interest. Radar and LiDAR technologies, on the other hand, need specialized hardware that is expensive and power-intensive. Furthermore, placing these sensors in non-public areas raises significant privacy concerns. To address these limitations, recent research has explored the use of WiFi antennas (1D sensors) for body segmentation and key-point body detection.


BeSense: Leveraging WiFi Channel Data and Computational Intelligence for Behavior Analysis

Gu, Yu, Zhang, Xiang, Liu, Zhi, Ren, Fuji

arXiv.org Artificial Intelligence

Ever-evolving informatics technology has gradually bound humans and computers together in a compact way. Understanding user behavior becomes a key enabler in many fields such as sedentary-related healthcare, human-computer interaction (HCI) and affective computing. Traditional sensor-based and vision-based user behavior analysis approaches are obtrusive in general, hindering their usage in the real world. Therefore, in this article, we first introduce the WiFi signal as a new source, instead of sensors and vision, for unobtrusive user behavior analysis. Then we design BeSense, a contactless behavior analysis system leveraging signal processing and computational intelligence over WiFi channel state information (CSI). We prototype BeSense on commodity low-cost WiFi devices and evaluate its performance in real-world environments. Experimental results have verified its effectiveness in recognizing user behaviors.
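Signal processing over raw CSI, as used in systems like BeSense, usually starts with outlier removal, since commodity WiFi hardware produces occasional amplitude spikes. A Hampel filter is one common choice; the sketch below is illustrative and not necessarily BeSense's exact pipeline.

```python
import numpy as np

def hampel_filter(x, window=5, n_sigma=3.0):
    """Replace samples that deviate from the local median by more than
    n_sigma robust standard deviations -- a common CSI denoising step."""
    y = x.copy()
    k = 1.4826  # scale factor relating MAD to standard deviation for Gaussians
    for i in range(len(x)):
        lo, hi = max(0, i - window), min(len(x), i + window + 1)
        med = np.median(x[lo:hi])
        mad = k * np.median(np.abs(x[lo:hi] - med))
        if abs(x[i] - med) > n_sigma * mad:
            y[i] = med  # outlier: snap to the local median
    return y

noisy = np.ones(50)
noisy[25] = 10.0  # spike, e.g. from hardware noise
clean = hampel_filter(noisy)
print(clean[25])  # 1.0
```

The median and MAD are used instead of mean and standard deviation because they are themselves robust to the very outliers the filter is trying to remove.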